Parallel univariate decision trees
Authors
Abstract
Univariate decision tree algorithms are widely used in data mining because (i) they are easy to learn and (ii) once trained, they can be expressed in a rule-based manner. In several applications, data mining chief among them, the dataset to be learned is very large. In those cases it is highly desirable to construct univariate decision trees in a reasonable time. This may be accomplished by parallelizing univariate decision tree algorithms. In this paper, we first present two different univariate decision tree algorithms: C4.5 and the univariate Linear Discriminant Tree. We show how to parallelize these algorithms in three ways: (i) feature-based, (ii) node-based, and (iii) data-based. Experimental results show that the performance of the parallelizations depends heavily on the dataset and that the node-based parallelization demonstrates good speedups.
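As a rough illustration of the feature-based scheme named in the abstract, the sketch below gives each worker one feature to scan for its best threshold and then lets the caller pick the overall winner. It is not the paper's implementation. The function names, the toy data, the use of plain information gain (rather than C4.5's gain ratio), and the use of a thread pool (rather than separate processors) are all assumptions made for brevity.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_split_for_feature(rows, labels, feature):
    """Scan one feature; return (gain, feature, threshold) for its best binary split."""
    base = entropy(labels)
    n = len(labels)
    values = sorted({row[feature] for row in rows})
    best = (0.0, feature, None)  # threshold stays None if no split improves on the base
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2  # midpoint between consecutive distinct values
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best[0]:
            best = (gain, feature, threshold)
    return best


def best_split_feature_parallel(rows, labels, max_workers=4):
    """Hypothetical feature-based parallel split search: one task per feature."""
    n_features = len(rows[0])
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        candidates = list(
            pool.map(lambda f: best_split_for_feature(rows, labels, f), range(n_features))
        )
    # The coordinating step: keep the candidate with the highest gain.
    return max(candidates, key=lambda c: c[0])


if __name__ == "__main__":
    rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 5.0), (4.0, 1.0)]
    labels = ["a", "a", "b", "b"]
    print(best_split_feature_parallel(rows, labels))
```

In CPython, threads will not speed up this pure-Python scan; the pool only shows how the work is divided. A process pool or a message-passing setup would be the natural substitute when real speedup is the goal.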
Similar resources
Global Induction of Decision Trees
Decision trees are, besides decision rules, one of the most popular forms of knowledge representation in the Knowledge Discovery in Databases process (Fayyad, Piatetsky-Shapiro, Smyth & Uthurusamy, 1996), and implementations of the classical decision tree induction algorithms are included in the majority of data mining systems. A hierarchical structure of a tree-based classifier, where appropriate t...
Global Induction of Decision Trees: From Parallel Implementation to Distributed Evolution
In most data mining systems, decision trees are induced in a top-down manner. This greedy method is fast but can fail for certain classification problems. As an alternative, a global approach based on evolutionary algorithms (EAs) can be applied. We developed the Global Decision Tree (GDT) system, which learns a tree structure and tests in one run of the EA. Specialized genetic operators are used,...
A framework for bottom-up induction of oblique decision trees
Decision-tree induction algorithms are widely used in knowledge discovery and data mining, especially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes...
Discovery of Relevant New Features by Generating Non-Linear Decision Trees
field of manufacturing new features. Most decision tree algorithms using selective induction focus on univariate, i.e. axis-parallel, tests at each internal node of a tree. Oblique decision trees use multivariate linear tests at each non-leaf node. One well-known limitation of selective induction algorithms, however, is their inadequate description of hypotheses by task-supplied original features....
On the VC-Dimension of Univariate Decision Trees
In this paper, we give and prove lower bounds of the VC-dimension of the univariate decision tree hypothesis class. The VC-dimension of the univariate decision tree depends on the VC-dimension values of its subtrees and the number of inputs. In our previous work (Aslan et al., 2009), we proposed a search algorithm that calculates the VC-dimension of univariate decision trees exhaustively. Using...
Journal title:
- Pattern Recognition Letters
Volume 28
Publication date: 2007